Similarity search optimization using recently-biased symbolic representation

Authors

Tamer Hassan Abd El Salam

Zalinda Othman

Abdul Razak Hamdan

Abstract

Dimension reduction is an important requirement for a successful time series representation, because it improves the efficiency of extracting interesting trend patterns from the series. Furthermore, efficient and accurate similarity search on a huge time series data set is a crucial problem in data mining preprocessing. Symbolic representations have proven to be a very effective way to reduce the dimensionality of time series without loss of knowledge. However, symbolic representations face another challenge: some principal patterns may be lost because the whole series is treated with the same weight, which is impractical. The methodology proposed in this paper is intended to overcome this pattern mismatch in symbolic representations. Moreover, the data dimensionality is reduced by keeping more detail on recent data and less detail on older data, using a modified sliding window controlled by the corresponding classification error rate. Experiments were conducted on the UCR standard dataset and compared with state-of-the-art techniques. The proposed techniques showed promising results. Furthermore, practical experiments were carried out on the Egyptian stock market indices EGX 30, EGX 70 and EGX 100. The discovered patterns showed the accuracy and effectiveness of the proposed approach.
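The idea of keeping more detail on recent data and less on older data can be illustrated with a small sketch. This is not the authors' algorithm: the abstract says the sliding window is controlled by the classification error rate, whereas the code below assumes a fixed geometric growth of segment widths as a stand-in. The symbolization step follows the general style of symbolic aggregate approximation (segment means mapped to letters through Gaussian breakpoints). The function names `recency_biased_segments` and `symbolize`, and the parameters `first_width` and `growth`, are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist, mean, pstdev


def recency_biased_segments(n, first_width=1, growth=2):
    """Split range(n) into (start, end) pairs, oldest first.

    Assumption for illustration: the most recent (last) segment has
    `first_width` samples and each older segment is `growth` times wider.
    The oldest segment may be truncated to fit the series.
    """
    bounds = []
    end = n
    width = first_width
    while end > 0:
        start = max(0, end - width)
        bounds.append((start, end))
        end = start
        width *= growth
    bounds.reverse()  # return segments in chronological order
    return bounds


def symbolize(series, alphabet="abcd", first_width=1, growth=2):
    """Map a numeric series to a short word with finer detail near the end."""
    mu = mean(series)
    sigma = pstdev(series) or 1.0  # avoid division by zero on flat series
    z = [(x - mu) / sigma for x in series]

    # Breakpoints that split a standard normal into equiprobable regions.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]

    word = []
    for start, end in recency_biased_segments(len(z), first_width, growth):
        avg = mean(z[start:end])
        index = sum(avg > b for b in breakpoints)  # region the mean falls in
        word.append(alphabet[index])
    return "".join(word)


if __name__ == "__main__":
    prices = list(range(15))  # toy upward trend, not real market data
    print(recency_biased_segments(len(prices)))
    print(symbolize(prices))
```

With these assumed settings, a 15-point series is summarized by four symbols: one symbol covers the oldest eight points, while the most recent point gets a symbol of its own.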

Related articles

A New Algorithm for Optimization of Fuzzy Decision Tree in Data Mining

Decision-tree algorithms provide one of the most popular methodologies for symbolic knowledge acquisition. The resulting knowledge, a symbolic decision tree along with a simple inference mechanism, has been praised for comprehensibility. The most comprehensible decision trees have been designed for perfect symbolic data. Classical crisp decision trees (DT) are widely applied to classification t...

Enhancing the Symbolic Aggregate Approximation Method Using Updated Lookup Tables

Similarity search in time series data mining is a problem that has attracted increasing attention recently. The high dimensionality and large volume of time series databases make sequential scanning inefficient to tackle this problem. There are many representation techniques that aim at reducing the dimensionality of time series so that the search can be handled faster at a lower dimensional sp...

Composite Kernel Optimization in Semi-Supervised Metric

Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...

An Empirical Study of Similarity Search in Stock Data

Using certain artificial intelligence techniques, stock data mining has given encouraging results in both trend analysis and similarity search. However, representing stock data effectively is a key issue in ensuring the success of a data mining process. In this paper, we aim to compare the performance of numeric and symbolic data representation of a stock dataset in terms of similarity search. ...

Accurate Deep Representation Quantization with Gradient Snapping Layer for Similarity Search

Recent advance of large scale similarity search involves using deeply learned representations to improve the search accuracy and use vector quantization methods to increase the search speed. However, how to learn deep representations that strongly preserve similarities between data pairs and can be accurately quantized via vector quantization remains a challenging task. Existing methods simply ...

Publication date: 2014
